1.
Nucleic Acids Res ; 52(D1): D938-D949, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38000386

ABSTRACT

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.


Subjects
Databases, Factual; Disease; Genes; Phenotype; Humans; Internet; Databases, Factual/standards; Software; Genes/genetics; Disease/genetics
2.
Bioinformatics ; 39(7), 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37389415

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org.
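The extract-transform-load pattern described in this abstract can be sketched as follows. This is a minimal illustration, not KG-Hub's actual code: the input records, the function names, and the simplified node and edge table layout are all assumptions made for the example. Only the Biolink Model terms `biolink:Gene`, `biolink:Disease`, and `biolink:related_to` are taken from the published model.

```python
import csv
import io

# Hypothetical upstream records with placeholder identifiers; a real ingest
# would read a cached download of the upstream source instead.
RAW_TSV = "gene_id\tdisease_id\nHGNC:1100\tMONDO:0007254\nHGNC:1101\tMONDO:0007254\n"


def extract(raw: str) -> list[dict]:
    """Parse the raw tab-separated source into one dict per row."""
    return list(csv.DictReader(io.StringIO(raw), delimiter="\t"))


def transform(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Map source rows onto node and edge records typed with Biolink categories."""
    nodes: dict[str, dict] = {}  # keyed by id so repeated entities are stored once
    edges: list[dict] = []
    for row in rows:
        gene, disease = row["gene_id"], row["disease_id"]
        nodes[gene] = {"id": gene, "category": "biolink:Gene"}
        nodes[disease] = {"id": disease, "category": "biolink:Disease"}
        edges.append(
            {"subject": gene, "predicate": "biolink:related_to", "object": disease}
        )
    return list(nodes.values()), edges


def load(records: list[dict], fieldnames: list[str]) -> str:
    """Serialize records as a tab-separated table (assumed output layout)."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    nodes, edges = transform(extract(RAW_TSV))
    print(load(nodes, ["id", "category"]))
    print(load(edges, ["subject", "predicate", "object"]))
```

Keeping the three stages as separate functions mirrors the modularity the abstract emphasizes: a transformed subgraph can be reused in another project without repeating the extract step.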


Subjects
Biological Ontologies; COVID-19; Humans; Pattern Recognition, Automated; Rare Diseases; Machine Learning
3.
PLoS One ; 16(3): e0231916, 2021.
Article in English | MEDLINE | ID: mdl-33755673

ABSTRACT

AVAILABILITY: The API and associated software are open source and currently available at https://github.com/NCATS-Tangerine/translator-knowledge-beacon.


Subjects
Knowledge; Software; Databases, Factual; Internet
4.
Proc Natl Acad Sci U S A ; 106(30): 12273-8, 2009 Jul 28.
Article in English | MEDLINE | ID: mdl-19597147

ABSTRACT

Rice, the primary source of dietary calories for half of humanity, is the first crop plant for which a high-quality reference genome sequence from a single variety was produced. We used resequencing microarrays to interrogate 100 Mb of the unique fraction of the reference genome for 20 diverse varieties and landraces that capture the impressive genotypic and phenotypic diversity of domesticated rice. Here, we report the distribution of 160,000 nonredundant SNPs. Introgression patterns of shared SNPs revealed the breeding history and relationships among the 20 varieties; some introgressed regions are associated with agronomic traits that mark major milestones in rice improvement. These comprehensive SNP data provide a foundation for deep exploration of rice diversity and gene-trait relationships and their use for future rice improvement.


Subjects
Genetic Variation; Genome, Plant/genetics; Oryza/genetics; Polymorphism, Single Nucleotide; Chromosome Mapping; Chromosomes, Plant/genetics; Gene Frequency; Genotype; Molecular Sequence Data; Oryza/classification; Phylogeny; Quantitative Trait Loci/genetics; Sequence Analysis, DNA; Species Specificity
5.
Nucleic Acids Res ; 36(Database issue): D943-6, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17933772

ABSTRACT

The Generation Challenge Programme (GCP; www.generationcp.org) has developed an online resource documenting stress-responsive genes comparatively across plant species. This public resource is a compendium of protein families, phylogenetic trees, multiple sequence alignments (MSA) and associated experimental evidence. The central objective of this resource is to elucidate orthologous and paralogous relationships between plant genes that may be involved in response to environmental stress, mainly abiotic stresses such as water deficit ('drought'). The web-based graphical user interface (GUI) of the resource includes query and visualization tools that allow diverse searches and browsing of the underlying project database. The web interface can be accessed at http://dayhoff.generationcp.org.


Subjects
Crops, Agricultural/genetics; Databases, Genetic; Genes, Plant; Crops, Agricultural/metabolism; Dehydration; Environment; Gene Expression Profiling; Internet; Phylogeny; Plant Proteins/chemistry; Plant Proteins/classification; Sequence Alignment; User-Computer Interface
6.
Plant Physiol ; 139(2): 637-42, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16219924

ABSTRACT

Ambiguous germplasm identification; difficulty in tracing pedigree information; and lack of integration between genetic resources, characterization, breeding, evaluation, and utilization data are constraints in developing knowledge-intensive crop improvement programs. To address these constraints, the International Crop Information System (www.icis.cgiar.org), a database system for the management and integration of global information on genetic resources and crop improvement for any crop, was developed by genetic resource specialists, crop scientists, and information technicians associated with the Consultative Group for International Agricultural Research and collaborative partners. The International Rice Information System (www.iris.irri.org) is the rice (Oryza species) implementation of the International Crop Information System. New components are now being added to the International Rice Information System to handle the diversity of rice functional genomics data including genomic sequence data, molecular genetic data, expression data, and proteomic information. Users access information in the database through stand-alone programs and Web interfaces, which offer specialized applications and customized views to researchers with different interests.


Subjects
Databases, Genetic; Information Systems; Oryza/genetics; Breeding; Computational Biology; Internet; Management Information Systems; Meta-Analysis as Topic; Software
7.
Bioinformatics ; 20(2): 155-60, 2004 Jan 22.
Article in English | MEDLINE | ID: mdl-14734305

ABSTRACT

MOTIVATION: The high content of repetitive sequences in the genomes of many higher eukaryotes renders the task of annotating them computationally intensive. Presently, the only widely accepted method of searching and annotating transposable elements (TEs) in large genomic sequences is the use of the RepeatMasker program, which identifies new copies of TEs by pairwise sequence comparisons with a library of known TEs. Profile hidden Markov models (HMMs) have been used successfully in discovering distant homologs of known proteins in large protein databases, but this approach has only rarely been applied to known model TE families in genomic DNA. RESULTS: We used a combination of computational approaches to annotate the TEs in the finished genome of Oryza sativa ssp. japonica. In this paper, we discuss the strengths and the weaknesses of the annotation methods used. These approaches included: the default configuration of RepeatMasker using cross_match, an implementation of the Smith-Waterman-Gotoh algorithm; RepeatMasker using WU-BLAST for similarity searching; and the HMMER package, used to search for TEs with profile HMMs. All the results were converted into GFF format and post-processed using a set of Perl scripts. RepeatMasker was used in the case of most TE families. The WU-BLAST implementation of RepeatMasker was found to be manifold faster than cross_match with only a slight loss in sensitivity and was thus used to obtain the final set of data. HMMER was used in the annotation of the Mutator-like element (MULE) superfamily and the miniature inverted-repeat transposable element (MITE) polyphyletic group of families, for which large libraries of elements were available and which could be divided into well-defined families. The HMMER search algorithm was extremely slow for models over 1000 bp in length, so MULE families with members over 1000 bp long were processed with RepeatMasker instead. 
The main disadvantage of HMMER in this application is that, since it was developed with protein sequences in mind, it does not search the negative DNA strand. With the exception of TE families with essentially palindromic sequences, reverse complement models had to be created and run to compensate for this shortcoming. We conclude that a modification of RepeatMasker to incorporate libraries of profile HMMs in searches could improve the ability to detect degenerated copies of TEs. AVAILABILITY: The Perl scripts and TE sequences used in construction of the RepeatMasker library and the profile HMMs are available upon request.
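The strand workaround and the GFF conversion described in this abstract can be sketched as follows. This is an illustration only: the authors used Perl scripts that are not reproduced here, and the function names, the feature label, and the attribute syntax below are assumptions. The sketch shows how a reverse-complement copy of a sequence is derived (the basis for building a second model for the negative strand), how an essentially palindromic sequence can be recognized, and how a hit can be written as a nine-column tab-separated GFF-style record.

```python
# Translation table for complementing DNA bases; N is kept as N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement,
    in which case a separate negative-strand model adds nothing."""
    return seq.upper() == reverse_complement(seq).upper()


def gff_line(seqid: str, source: str, start: int, end: int,
             score: float, strand: str, family: str) -> str:
    """Format one hit as a nine-column, tab-separated GFF-style record.

    Columns: sequence, source, feature, start, end, score, strand, frame,
    attributes. The feature label and the attribute syntax are assumptions.
    """
    fields = [
        seqid,
        source,
        "transposable_element",  # hypothetical feature label
        str(start),
        str(end),
        f"{score:.1f}",
        strand,
        ".",                     # frame is not applicable to repeats
        f"family={family}",
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    seed = "ATGCCGTA"  # made-up sequence for illustration
    if not is_palindromic(seed):
        print("negative-strand copy:", reverse_complement(seed))
    print(gff_line("chr1", "HMMER", 1200, 1450, 37.5, "-", "MULE"))
```

Hits found with the reverse-complement model would be reported with strand `-`, so results from both runs can be merged into a single GFF file for post-processing.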


Subjects
Algorithms; DNA Transposable Elements/genetics; Documentation; Gene Expression Profiling/methods; Genome, Plant; Oryza/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Databases, Nucleic Acid; Models, Genetic; Models, Statistical; Software
8.
Bioinformatics ; 19 Suppl 1: i63-5, 2003.
Article in English | MEDLINE | ID: mdl-12855438

ABSTRACT

The International Rice Information System (IRIS, http://www.iris.irri.org) is the rice implementation of the International Crop Information System (ICIS, http://www.icis.cgiar.org), a database system for the management and integration of global information on genetic resources and germplasm improvement for any crop. Building upon the germplasm genealogy and field data components of ICIS, IRIS is being extended to handle diverse rice genomics data, including genetic mapping, genome annotation, genotype, mutant, transcriptome, proteome and metabolomic data. Users can access information in the database through stand-alone programs and WWW interfaces offering specialist views to researchers with different interests.


Subjects
Database Management Systems; Databases, Genetic; Information Storage and Retrieval/methods; Oryza/genetics; Oryza/metabolism; Software; User-Computer Interface; Gene Expression Profiling/methods; Genotype; Information Dissemination/methods; Internationality; Oryza/classification; Phenotype; Plant Proteins/classification; Plant Proteins/genetics; Plant Proteins/metabolism; Software Design; Systems Integration